aiTPR: Attribute Interaction-Tensor Product Representation for Image Caption
Authors
Abstract
Region visual features enhance the generative capability of machines based on features. However, they lack proper interaction-based attentional perception and end up producing biased or uncorrelated sentences or pieces of misinformation. In this work, we propose Attribute Interaction-Tensor Product Representation (aiTPR), a convenient way of gathering more information through orthogonal combination and of learning interactions as physical entities (tensors) to improve captions. Compared to previous works, where features are added up into undefined feature spaces, TPR helps maintain sanity in the combinations, and orthogonality helps define familiar spaces. We introduce a new concept layer that defines the objects and their interactions, which can play a crucial role in determining different descriptions. The interaction portions contributed heavily to better caption quality, and the model outperformed various previous works in this domain on the MSCOCO dataset. For the first time, we introduce the notion of combining regional image features with an abstracted interaction likelihood embedding for image captioning.
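As a rough illustration of why orthogonality matters in a tensor product representation, the sketch below binds filler vectors to orthonormal role vectors by summing outer products, then recovers one filler exactly. It is a generic TPR sketch, not the aiTPR model: the function names (`bind`, `unbind`) and all vector values are hypothetical and chosen only for the example.

```python
# Generic tensor product representation (TPR) sketch.
# Assumption: roles are orthonormal; all names and values are hypothetical.

def outer(filler, role):
    """Outer product of a filler vector and a role vector."""
    return [[f * r for r in role] for f in filler]

def bind(fillers, roles):
    """Sum of filler-role outer products, one term per (filler, role) pair."""
    rows, cols = len(fillers[0]), len(roles[0])
    tensor = [[0.0] * cols for _ in range(rows)]
    for filler, role in zip(fillers, roles):
        term = outer(filler, role)
        tensor = [[t + x for t, x in zip(t_row, x_row)]
                  for t_row, x_row in zip(tensor, term)]
    return tensor

def unbind(tensor, role):
    """Recover the filler bound to an orthonormal role (matrix-vector product)."""
    return [sum(t * r for t, r in zip(row, role)) for row in tensor]

roles = [[1.0, 0.0], [0.0, 1.0]]              # orthonormal role vectors
fillers = [[0.2, 0.5, 0.1], [0.9, 0.0, 0.3]]  # e.g. two attribute embeddings
T = bind(fillers, roles)
recovered = unbind(T, roles[1])               # equals fillers[1]
```

Because the roles are orthonormal, each filler can be read back without interference from the others, which is one way to read the abstract's contrast with features summed into an undefined space.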
Similar resources
Learning a Recurrent Visual Representation for Image Caption Generation
In this paper we explore the bi-directional mapping between images and their sentence-based descriptions. We propose learning this mapping using a recurrent neural network. Unlike previous approaches that map both sentences and images to a common embedding, we enable the generation of novel sentences given an image. Using the same model, we can also reconstruct the visual features associated wi...
Multimodal Pivots for Image Caption Translation
We present an approach to improve statistical machine translation of image descriptions by multimodal pivots defined in visual space. Image similarity is computed by a convolutional neural network and incorporated into a target-side translation memory retrieval model where descriptions of most similar images are used to rerank translation outputs. Our approach does not depend on the availabilit...
Cross-Lingual Image Caption Generation
Automatically generating a natural language description of an image is a fundamental problem in artificial intelligence. This task involves both computer vision and natural language processing and is called “image caption generation.” Research on image caption generation has typically focused on taking in an image and generating a caption in English as existing image caption corpora are mostly ...
Topic-Specific Image Caption Generation
Recently, image caption which aims to generate a textual description for an image automatically has attracted researchers from various fields. Encouraging performance has been achieved by applying deep neural networks. Most of these works aim at generating a single caption which may be incomprehensive, especially for complex images. This paper proposes a topic-specific multi-caption generator, ...
A Note on Tensor Product of Graphs
Let $G$ and $H$ be graphs. The tensor product $G \otimes H$ of $G$ and $H$ has vertex set $V(G \otimes H) = V(G) \times V(H)$ and edge set $E(G \otimes H) = \{(a,b)(c,d) \mid ac \in E(G) \text{ and } bd \in E(H)\}$. In this paper, some results on this product are obtained by which it is possible to compute the Wiener and Hyper Wiener indices of $K_n \otimes G$.
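The edge-set definition above can be checked on a small case. The following is a minimal sketch, assuming undirected simple graphs given as vertex and edge lists; the function name `tensor_product` and the example graphs are hypothetical and not taken from the cited paper.

```python
from itertools import product

def tensor_product(v_g, e_g, v_h, e_h):
    """Vertex list and edge set of G (x) H for undirected simple graphs."""
    vertices = list(product(v_g, v_h))
    adj_g = {frozenset(e) for e in e_g}
    adj_h = {frozenset(e) for e in e_h}
    edges = set()
    for (a, b), (c, d) in product(vertices, repeat=2):
        # (a,b)(c,d) is an edge iff ac is an edge of G and bd is an edge of H
        if frozenset((a, c)) in adj_g and frozenset((b, d)) in adj_h:
            edges.add(frozenset(((a, b), (c, d))))
    return vertices, edges

# K_2 (x) K_2: four vertices and two disjoint edges
vertices, edges = tensor_product([0, 1], [(0, 1)], ["x", "y"], [("x", "y")])
```

Here the product of two single edges yields the edges $(0,x)(1,y)$ and $(0,y)(1,x)$, so the result is disconnected even though both factors are connected.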
Journal
Journal title: Neural Processing Letters
Year: 2021
ISSN: 1573-773X, 1370-4621
DOI: https://doi.org/10.1007/s11063-021-10438-5